A Probabilistic Word Class Tagging Module Based On Surface Pattern Matching
نویسنده
چکیده
................................................................................................................... 1
منابع مشابه
برچسبگذاری ادات سخن زبان فارسی با استفاده از مدل شبکۀ فازی
Part of speech tagging (POS tagging) is an ongoing research in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS-tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...
متن کاملA Multi-phase Semi-supersense Tagging of Korean Unknown Nouns
Supersense tagging is a problem of finding a corresponding semantic super tag (eg. Phenomenon, Act) based on syntactic information and annotated corpora. However, we employ semantic information rather than syntactic one and annotated corpora, because Korean language has relatively flexible syntactic structure and is lack of annotated corpora. To construct the automatic sense tagging system for ...
متن کاملA Morpheme-based Part-of-Speech Tagger for Chinese
This paper presents a morpheme-based part-of-speech tagger for Chinese. It consists of two main components, namely a morpheme segmenter to segment each word in a sentence into a sequence of morphemes, based on forward maximum matching, and a lexical tagger to label each morpheme with a proper tag indicating its position pattern in forming a word of a specific class, based on lexicalized hidden ...
متن کاملAutomatic Construction of a Chinese Electronic Dictionary
In this paper, an unsupervised approach for constructing a large-scale Chinese electronic dictionary is surveyed. The main purpose is to enable cheap and quick acquisition of a large-scale dictionary from a large untagged text corpus with the aid of the information in a small tagged seed corpus. The basic model is based on a Viterbi reestimation technique. During the dictionary construction pro...
متن کاملMulti-stage Annotation using Pattern-based and Statistical-based Techniques for Automatic Thai Annotated Corpus Construction
An automated or semi-automated annotation is a practical solution towards largescale corpus construction. However, special characteristics of Thai language, such as lack of word-boundary and sentenceboundary markers trigger several issues in automatic corpus annotation. This paper presents a multi-stage annotation framework, containing two stages of chunking and three stages of tagging. Two chu...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1993